Abstract 

In this paper, we reinterpret the (k + 2, k) Zigzag code in terms of coding matrices and then propose an optimal exact repair strategy 
for its parity nodes, whose repair disk I/O approaches a lower bound derived in this paper. 

I. Introduction 

Distributed storage systems built on huge numbers of storage nodes have wide applications in peer-to-peer storage systems 
such as OceanStore [?], Total Recall [?] and DHash++ [?]. Erasure codes, which provide both protection against node 
failures and efficient data storage, are very common in distributed storage systems [?]. For instance, as 
a special class of erasure codes, RAID-6 is a popular scheme for tolerating any two node failures [?]. 

Upon failure of a single node, a self-sustaining system should repair the failed node in order to retain the same redundancy. 
In the literature, there are mainly two repair types: exact repair and functional repair. Compared with the latter, exact repair 
is preferred since, by regenerating exact replicas of the lost data at the failed node, it does not incur significant additional 
system overhead [?]. Generally speaking, there are several metrics to evaluate the performance of node repair, such as the 
repair bandwidth, defined as the amount of data downloaded from surviving nodes to repair a failed node, and the disk 
I/O, defined as the amount of data read from the surviving nodes. 

Recently, Dimakis et al. [?] introduced a new class of erasure codes for distributed storage systems named minimum storage 
regenerating (MSR) codes. A distributed storage system deploys a (k + r, k) MSR code to store a file of size M = kN 
symbols across n = k + r nodes, each node keeping N symbols. The (k + r, k) MSR code has the optimal repair property that the repair 
bandwidth $\gamma = \frac{dN}{d-k+1}$ is minimal, which is achieved by downloading $\frac{N}{d-k+1}$ symbols from each of any $d$ 
surviving nodes, $k \le d \le k+r-1$, when repairing a failed node. In this paper, we only focus on the exact repair of high rate MSR codes. When 
r = 1, the repair bandwidth is the highest, i.e., $\gamma = M$. When r = 2 and d = k + 1, the MSR code is very desirable since it 
achieves the highest rate for which $\gamma = (k+1)N/2 < M$. In addition, (k + 2, k) MSR codes can be an alternative to RAID-6 
schemes. 
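
As a quick arithmetic check of the two cases just mentioned, the following minimal sketch evaluates $\gamma = dN/(d-k+1)$. The function name `msr_repair_bandwidth` and the sample values k = 4, N = 8 are illustrative assumptions, not taken from the paper.

```python
from fractions import Fraction


def msr_repair_bandwidth(k, r, d, N):
    """Repair bandwidth gamma = d*N/(d-k+1) of a (k+r, k) MSR code.

    d is the number of helper nodes and must satisfy k <= d <= k+r-1.
    """
    if not k <= d <= k + r - 1:
        raise ValueError("d must satisfy k <= d <= k + r - 1")
    return Fraction(d * N, d - k + 1)


k, N = 4, 8          # illustrative values (N = 2^(k-1), as for the Zigzag code)
M = k * N            # file size

# r = 1 forces d = k, so the whole file is downloaded: gamma = M
print(msr_repair_bandwidth(k, 1, k, N), M)
# r = 2 with d = k + 1 gives gamma = (k+1)N/2 < M
print(msr_repair_bandwidth(k, 2, k + 1, N), Fraction((k + 1) * N, 2))
```

Exact fractions are used only to avoid rounding when d - k + 1 does not divide dN.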

So far, several explicit constructions of (k + 2, k) MSR codes have been presented [?], [10], [13], [15]. Among them, the 
(k + 2, k) Zigzag code in [13], which is defined by a series of permutations, is of great interest because of: 

(i) Optimal update disk I/O property (also known as the optimal update property in [?]): only the symbol itself and one symbol at 
each parity node need an update when a symbol in a systematic node is rewritten; 

(ii) Optimal repair disk I/O property (also known as optimal rebuilding in [13]) for systematic nodes: the repair disk I/O 
of a systematic node is equal to the minimal repair bandwidth; 

(iii) Small alphabet size of 3, so that it can be easily implemented; 

(iv) The storage $N = 2^{k-1}$ achieves the theoretical lower bound on the storage per node for (k + 2, k) MSR codes with both 
optimal update disk I/O and optimal repair disk I/O for systematic nodes [?]. 

However, the parity nodes of the (k + 2, k) Zigzag code were trivially repaired by downloading all the original data in [13], i.e., 
the download bandwidth reaches the maximal value $\gamma = M$. In order to acquire the optimal repair property for both systematic 
nodes and parity nodes, a (k, k − 2) MSR code was presented in [16] based on a modification of the (k + 2, k) Zigzag code, 
but at the cost of sacrificing two systematic nodes while maintaining the same storage per node $N = 2^{k-1}$. It should be noted that, 
among all the aforementioned codes, only the (k + 2, k) Hadamard MSR code in [10] also has the optimal repair property for all 
the nodes. 

In this paper, without changing the original structure of the (k + 2, k) Zigzag code, we propose an optimal repair strategy 
for the two parity nodes, whose download bandwidth achieves the minimal value $\gamma = (k+1)N/2$. A comparison of the 
properties of various known (k + 2, k) MSR codes, namely the Zigzag code employing our repair strategy, the original Zigzag 
code [13], the modified Zigzag code [16], and the Hadamard code [10], is given in Table I. It is seen that the new repair strategy 
does not lose any good properties of the original Zigzag code, for example, the optimal update disk I/O property, the optimal 
repair disk I/O property for systematic nodes, the small alphabet size of 3, and so on. In contrast to the modified Zigzag code 
and the Hadamard code, which have the same optimal repair property for all nodes, the Zigzag code employing the new repair strategy 
shows a clear advantage in the storage per node. Although the repair disk I/O of a parity node, which is 
kN + N − k, is not optimal, i.e., it is larger than the minimal repair bandwidth (k + 1)N/2, it approaches a lower bound on the disk I/O of 
the Zigzag code given in this paper. 


TABLE I 

Comparison of the properties of some (k + 2, k) MSR codes, where q and N denote the size of the finite field required and the 
storage per node, respectively. 

Code | q | N | Optimal update disk I/O | Optimal repair disk I/O (systematic nodes) | Optimal repair disk I/O (parity nodes) | Optimal repair (systematic nodes) | Optimal repair (parity nodes) 
Zigzag code employing new repair strategy | 3 | 2^{k-1} | Yes | Yes | No | Yes | Yes 
Original Zigzag code [13] | 3 | 2^{k-1} | Yes | Yes | No | Yes | No 
Modified Zigzag code [16] | 3 | 2^{k+1} | No | Yes | Yes | Yes | Yes 
Hadamard code [10] | 2k + 3 | 2^{k+1} | Yes | No | No | Yes | Yes 

The rest of this paper is organized as follows. Section II introduces the structure of a (k + 2, k) MSR code and the necessary 
and sufficient conditions for optimal repair of parity nodes. Section III reviews the (k + 2, k) Zigzag code and reinterprets it 
in terms of coding matrices. In Section IV, a lower bound on the disk I/O to optimally repair the parity nodes of the (k + 2, k) Zigzag code 
is presented. The optimal repair strategy for the parity nodes of the (k + 2, k) Zigzag code is given in Section V. 

II. Optimal repair for parity nodes of (k + 2, k) MSR codes 

Let q be a prime power and $F_q$ be the finite field with q elements. Assume that a file of size M = kN is equally partitioned 
into k parts, respectively denoted by $f_0, f_1, \ldots, f_{k-1}$, where $f_j$ is a column vector of length N for $0 \le j < k$. The file is 
encoded by a (k + 2, k) MSR code and then stored across k systematic and two parity storage nodes, each node having storage 
N. The first k nodes are systematic nodes, which store the file parts $f_0, f_1, \ldots, f_{k-1}$ in uncoded form respectively. Without 
loss of generality, assume that the two parity nodes, nodes k and k + 1, respectively store $f_k = f_0 + f_1 + \cdots + f_{k-1}$ and 
$f_{k+1} = A_0 f_0 + A_1 f_1 + \cdots + A_{k-1} f_{k-1}$ for some N × N matrices $A_0, \ldots, A_{k-1}$ over $F_q$, where the matrix $A_j$ is called the 
coding matrix for systematic node j, $0 \le j < k$. To guarantee the MDS property, it is required that [10], [14] 

$$\mathrm{rank}(A_i) = \mathrm{rank}(A_i - A_j) = N, \quad 0 \le i \ne j < k. \tag{1}$$

Table II illustrates the structure of a (k + 2, k) MSR code. 


TABLE II 

Structure of a (k + 2, k) MSR code 

Node 0 | Node 1 | ... | Node k − 1 | Node k | Node k + 1 
$f_0$ | $f_1$ | ... | $f_{k-1}$ | $f_k = \sum_{i=0}^{k-1} f_i$ | $f_{k+1} = \sum_{i=0}^{k-1} A_i f_i$ 


When repairing a failed node j, the optimal repair property demands downloading half of the data from each surviving node l, 
$0 \le l \ne j < k + 2$, by multiplying its original data $f_l$ with an N/2 × N matrix of rank N/2, called the repair matrix. In what 
follows, we review the requirements on the repair matrices for the optimal repair of the parity nodes of a (k + 2, k) MSR code [10], [?]. 

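Upon failure of the first parity node (node k), respectively downloading $S_a f_j$, $0 \le j < k$, and $\tilde{S}_a f_{k+1}$, where $S_a$ and $\tilde{S}_a$ are 
two N/2 × N repair matrices of rank N/2, one eventually gets the following system of linear equations 

$$\begin{pmatrix} S_a f_0 \\ \tilde{S}_a f_{k+1} \end{pmatrix} = \underbrace{\begin{pmatrix} S_a \\ \tilde{S}_a A_0 \end{pmatrix} f_k}_{\text{useful data}} - \sum_{i=1}^{k-1} \underbrace{\begin{pmatrix} S_a \\ \tilde{S}_a (A_0 - A_i) \end{pmatrix} f_i}_{\text{interference by } f_i}.$$

To cancel all the interference terms and then recover the target data $f_k$, the optimal repair requires [?], [?] 

$$\mathrm{rank}\begin{pmatrix} S_a \\ \tilde{S}_a A_0 \end{pmatrix} = N \tag{2}$$

and 

$$\mathrm{rank}\begin{pmatrix} S_a \\ \tilde{S}_a (A_0 - A_i) \end{pmatrix} = \frac{N}{2}, \quad 1 \le i < k. \tag{3}$$

Clearly, the disk I/O to optimally repair the first parity node is $kN_1 + N_2$, where $N_1$ and $N_2$ denote the numbers of nonzero columns of 
$S_a$ and $\tilde{S}_a$ respectively. 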
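
The system of linear equations for the first parity node can be checked by a short substitution. The worked step below is a sketch that uses only the definitions $f_k = \sum_{i=0}^{k-1} f_i$ and $f_{k+1} = \sum_{i=0}^{k-1} A_i f_i$ from this section; it eliminates $f_0 = f_k - \sum_{i=1}^{k-1} f_i$.

```latex
\begin{align*}
S_a f_0
  &= S_a f_k - \sum_{i=1}^{k-1} S_a f_i, \\
\tilde{S}_a f_{k+1}
  &= \tilde{S}_a A_0 f_0 + \sum_{i=1}^{k-1} \tilde{S}_a A_i f_i \\
  &= \tilde{S}_a A_0 \Bigl( f_k - \sum_{i=1}^{k-1} f_i \Bigr)
     + \sum_{i=1}^{k-1} \tilde{S}_a A_i f_i \\
  &= \tilde{S}_a A_0 f_k - \sum_{i=1}^{k-1} \tilde{S}_a (A_0 - A_i) f_i .
\end{align*}
```

Stacking the two identities gives the useful term in $f_k$ and one interference term per $f_i$, $1 \le i < k$. The interference from $f_i$ can be cancelled with the N/2 downloaded symbols $S_a f_i$ exactly when the rows of $\tilde{S}_a (A_0 - A_i)$ lie in the row space of $S_a$, which is the rank-N/2 condition.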

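To repair the second parity node (node k + 1), downloading $(S_b A_j) f_j$, $0 \le j < k$, and $\tilde{S}_b f_k$, where $S_b$ and $\tilde{S}_b$ are two 
N/2 × N matrices of rank N/2, one obtains the following system of linear equations 

$$\begin{pmatrix} S_b A_0 f_0 \\ \tilde{S}_b f_k \end{pmatrix} = \underbrace{\begin{pmatrix} S_b \\ \tilde{S}_b A_0^{-1} \end{pmatrix} f_{k+1}}_{\text{useful data}} - \sum_{i=1}^{k-1} \underbrace{\begin{pmatrix} S_b \\ \tilde{S}_b (A_0^{-1} - A_i^{-1}) \end{pmatrix} A_i f_i}_{\text{interference by } f_i}.$$

Similarly, optimal repair demands [?], [?] 

$$\mathrm{rank}\begin{pmatrix} S_b \\ \tilde{S}_b A_0^{-1} \end{pmatrix} = N \tag{4}$$

and 

$$\mathrm{rank}\begin{pmatrix} S_b \\ \tilde{S}_b (A_0^{-1} - A_i^{-1}) \end{pmatrix} = \frac{N}{2}, \quad 1 \le i < k. \tag{5}$$

Accordingly, the disk I/O to optimally repair the second parity node is the total number of nonzero columns of $\tilde{S}_b$ and $S_b A_i$, 
$0 \le i < k$. 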


III. Reinterpretation of (k + 2, k) Zigzag code in coding matrix 

Throughout this paper, let $k \ge 2$ and $N = 2^{k-1}$. Given an integer $0 \le i < N$, let $(i_1, \ldots, i_{k-1})$ be its binary expansion, 
i.e., $i = \sum_{j=1}^{k-1} 2^{k-1-j} i_j$. For simplicity, we do not distinguish a nonnegative integer i and its binary expansion if the context is 
clear. 

Let $\{e_j\}_{j=1}^{k-1}$ be the standard vector basis over $F_2$ of dimension k − 1, i.e., 

$$e_j = (\underbrace{0, \ldots, 0, 1, 0, \ldots, 0}_{k-1}), \quad 1 \le j < k,$$

with only the jth entry being nonzero. For convenience, set $e_0$ to be the all-zero vector. 

In [13], the (k + 2, k) Zigzag code is characterized by the following permutations $P_j : [0, N-1] \to [0, N-1]$, 

$$P_j(x) = \begin{cases} x, & j = 0 \\ (x_1, \ldots, x_{j-1}, x_j \oplus 1, x_{j+1}, \ldots, x_{k-1}), & 0 < j < k \end{cases}$$

where $\oplus$ denotes the addition in $F_2$. Obviously, 

$$P_j^{-1}(x) = x \oplus e_j = P_j(x), \quad 0 \le j < k. \tag{6}$$

For any integer $0 \le l < N$, define $Z_l$ as $Z_l = \{(i, j) \mid i = P_j^{-1}(l), 0 \le j < k\}$, i.e., 

$$Z_l = \{(i, j) \mid i = l \oplus e_j, 0 \le j < k\}$$

by (6). The structure of the (k + 2, k) Zigzag code is depicted in Table III, where the first parity node stores $f_{l,k} = \sum_{j=0}^{k-1} f_{l,j}$ 
and the second parity node stores $f_{l,k+1} = \sum_{(i,j) \in Z_l} \beta_{i,j} f_{i,j}$, $0 \le l < N$. Here, for $0 \le i < N$ and $0 \le j < k$, 
$\beta_{i,j} = (-1)^{i \cdot \sum_{t=0}^{j} e_t}$, where $\cdot$ denotes the inner product of the binary expansion of i with the given vector, i.e., 

$$\beta_{i,j} = \begin{cases} 1, & \text{if } j = 0 \\ (-1)^{i_1 + \cdots + i_j}, & \text{otherwise.} \end{cases} \tag{7}$$
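
To make the permutation-based description concrete, here is a minimal sketch that encodes the two parity nodes directly from $P_j$, $Z_l$ and $\beta_{i,j}$. The helper names (`flip`, `beta`, `encode_parities`) and the layout `f[i][j]` (symbol i of systematic node j) are illustrative assumptions; symbols are taken in $F_3$.

```python
def flip(l, j, k):
    """P_j(l) = l XOR e_j, where e_0 is the all-zero vector.

    Bit i_j of the binary expansion (i_1, ..., i_{k-1}) has weight 2^(k-1-j).
    """
    return l if j == 0 else l ^ (1 << (k - 1 - j))


def beta(i, j, k):
    """Coefficient beta_{i,j} = (-1)^(i_1 + ... + i_j), with beta_{i,0} = 1."""
    if j == 0:
        return 1
    top_j_bits = i >> (k - 1 - j)            # keeps exactly i_1, ..., i_j
    return (-1) ** bin(top_j_bits).count("1")


def encode_parities(f, k, q=3):
    """Return (first parity node, second parity node) for data f[i][j]."""
    N = 2 ** (k - 1)
    first = [sum(f[l][j] for j in range(k)) % q for l in range(N)]
    second = [
        sum(beta(flip(l, j, k), j, k) * f[flip(l, j, k)][j] for j in range(k)) % q
        for l in range(N)
    ]
    return first, second


# Usage: k = 3 (N = 4), a single nonzero symbol at row 2 of systematic node 1.
f = [[0] * 3 for _ in range(4)]
f[2][1] = 1
print(encode_parities(f, 3))
```

In this usage example the symbol appears once in the row parity (row 2) and once in the zigzag parity (row 0, with coefficient −1), which illustrates the optimal update property: one systematic symbol touches exactly one symbol per parity node.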


In the following, we reinterpret the data stored at the second parity node of the (k + 2, k) Zigzag code in the form of coding 
matrices so that we can use Equations (2)-(5) to check the optimality of our new repair matrices in Section V. 

Given an integer $k \ge 2$, recursively define k matrices $A_0^{(k)}, \ldots, A_{k-1}^{(k)}$ of order N over $F_3$ as 

$$A_0^{(k)} = I_{2^{k-1}}, \quad A_1^{(k)} = \begin{pmatrix} 0 & -I_{2^{k-2}} \\ I_{2^{k-2}} & 0 \end{pmatrix}, \quad A_j^{(k)} = \begin{pmatrix} A_{j-1}^{(k-1)} & 0 \\ 0 & -A_{j-1}^{(k-1)} \end{pmatrix} \text{ for } 2 \le j < k. \tag{8}$$
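TABLE III 

Structure of the (k + 2, k) Zigzag code 

Node 0 | ... | Node k − 1 | Node k | Node k + 1 
$f_{0,0}$ | ... | $f_{0,k-1}$ | $f_{0,k} = \sum_{j=0}^{k-1} f_{0,j}$ | $f_{0,k+1} = \sum_{(i,j) \in Z_0} \beta_{i,j} f_{i,j}$ 
$f_{1,0}$ | ... | $f_{1,k-1}$ | $f_{1,k} = \sum_{j=0}^{k-1} f_{1,j}$ | $f_{1,k+1} = \sum_{(i,j) \in Z_1} \beta_{i,j} f_{i,j}$ 
... | ... | ... | ... | ... 
$f_{N-1,0}$ | ... | $f_{N-1,k-1}$ | $f_{N-1,k} = \sum_{j=0}^{k-1} f_{N-1,j}$ | $f_{N-1,k+1} = \sum_{(i,j) \in Z_{N-1}} \beta_{i,j} f_{i,j}$ 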



First of all, the following properties of the matrices in (8) are obvious. 

Property 1. For any $k \ge 2$, the matrix $A_j^{(k)}$ in (8) with $1 \le j < k$ satisfies 

(i) $(A_j^{(k)})^2 = -I_{2^{k-1}}$; 

(ii) Each row and each column of $A_j^{(k)}$ has exactly one nonzero entry. 

Next, we show that the matrix $A_j^{(k)}$ in (8) is just the coding matrix for systematic node j of the (k + 2, k) Zigzag code for 
all $0 \le j < k$. 

Theorem 1. The coding matrices of the (k + 2, k) Zigzag code are $A_0^{(k)}, \ldots, A_{k-1}^{(k)}$, i.e., 

$$f_{k+1} = A_0^{(k)} f_0 + \cdots + A_{k-1}^{(k)} f_{k-1}$$

where $f_j = (f_{0,j}, \ldots, f_{N-1,j})^T$ for $0 \le j \le k + 1$. 

Proof: Let $A(l, i)$ denote the entry at row l and column i of a matrix A. By Property 1-(ii), equations (6) and (7), it suffices 
to prove $A_j^{(k)}(l, P_j^{-1}(l)) = \beta_{P_j^{-1}(l), j}$, i.e., 

$$A_0^{(k)}(l, l) = A_0^{(k)}(l, l \oplus e_0) = \beta_{l,0} = 1, \quad 0 \le l < N \tag{9}$$

and 

$$A_j^{(k)}(l, l \oplus e_j) = \beta_{l \oplus e_j, j} = (-1)^{l_1 + \cdots + l_j + 1}, \quad 1 \le j < k, \ 0 \le l < N. \tag{10}$$

Obviously, (9) holds since $A_0^{(k)}$ is the identity matrix, and (10) holds for j = 1, i.e., $A_1^{(k)}(l, l \oplus e_1) = (-1)^{l_1 + 1}$, $0 \le l < N$, 
by the definition in (8). 

Hereafter, we prove (10) for $j \ge 2$ by induction. Suppose that (10) holds for some $k \ge 2$ and all $1 \le j < k$. Then, 

$$\begin{aligned} A_j^{(k+1)}(l, l \oplus e_j) &= A_j^{(k+1)}\big((l_1, \ldots, l_k), (l_1, \ldots, l_{j-1}, l_j \oplus 1, l_{j+1}, \ldots, l_k)\big) \\ &= (-1)^{l_1} A_{j-1}^{(k)}\big((l_2, \ldots, l_k), (l_2, \ldots, l_{j-1}, l_j \oplus 1, l_{j+1}, \ldots, l_k)\big) \\ &= (-1)^{l_1 + \cdots + l_j + 1} \end{aligned}$$

for $2 \le j < k + 1$ and $0 \le l < 2^k$, where the last two equalities respectively follow from (8) and the induction assumption. 

■ 
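
Theorem 1 and Property 1 can be checked numerically for small k. The sketch below builds the matrices of (8) with NumPy and compares their entries with the coefficients $\beta_{i,j}$ of (7); the function names are illustrative assumptions, and the check is a sanity test rather than a substitute for the proof.

```python
import numpy as np


def coding_matrices(k):
    """Matrices A_0^(k), ..., A_{k-1}^(k) of the recursion (8), as integer arrays."""
    n = 2 ** (k - 1)
    half = n // 2
    I = np.eye(half, dtype=int)
    Z = np.zeros((half, half), dtype=int)
    A = [np.eye(n, dtype=int), np.block([[Z, -I], [I, Z]])]
    if k > 2:
        prev = coding_matrices(k - 1)
        A += [np.block([[prev[j - 1], Z], [Z, -prev[j - 1]]]) for j in range(2, k)]
    return A


def beta(i, j, k):
    """beta_{i,j} = (-1)^(i_1 + ... + i_j), with beta_{i,0} = 1, as in (7)."""
    return 1 if j == 0 else (-1) ** bin(i >> (k - 1 - j)).count("1")


def matches_zigzag(k):
    """True if A_j(l, l XOR e_j) = beta_{l XOR e_j, j} and Property 1 holds."""
    n = 2 ** (k - 1)
    A = coding_matrices(k)
    for j in range(k):
        e_j = 0 if j == 0 else 1 << (k - 1 - j)
        one_per_row = (np.count_nonzero(A[j], axis=1) == 1).all()
        one_per_col = (np.count_nonzero(A[j], axis=0) == 1).all()
        entries_ok = all(A[j][l, l ^ e_j] == beta(l ^ e_j, j, k) for l in range(n))
        square_ok = j == 0 or np.array_equal(A[j] @ A[j], -np.eye(n, dtype=int))
        if not (one_per_row and one_per_col and entries_ok and square_ok):
            return False
    return True


print([matches_zigzag(k) for k in range(2, 6)])
```

Since every row has a single nonzero entry at column $l \oplus e_j$, matching that entry with $\beta_{l \oplus e_j, j}$ is exactly the statement $f_{l,k+1} = \sum_{(i,j) \in Z_l} \beta_{i,j} f_{i,j}$.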

IV. Bounds on disk I/O to optimally repair the parity nodes of the Zigzag code 

For a general (k + 2, k) MSR code over $F_q$ defined in Table II, Wang et al. [17] proved that the minimal disk I/O to repair 
the first and second parity nodes are respectively at least (k + 1)N/2 and kN if q = 2. In fact, the assertion can be proved 
for q > 2 by almost the same proof as in [17]. 

Specifically for the Zigzag code, in this section we give a tighter bound on the minimal disk I/O for the optimal repair 
of the parity nodes. 

Firstly, we state a connection between the optimal repair strategies for the two parity nodes of the Zigzag code. 

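Lemma 1. If $S_a^{(k)}$ and $\tilde{S}_a^{(k)}$ are the repair matrices for the first parity node of the (k + 2, k) Zigzag code, then $\tilde{S}_a^{(k)} A_j^{(k)}$, $0 \le j < k$, 
and $S_a^{(k)}$ are the repair matrices for the second parity node, and vice versa. 

Proof: Note from (1) and (8) that $A_0^{(k)} - A_l^{(k)} = I_N - A_l^{(k)}$ is nonsingular for $1 \le l < k$. Then, 

$$\begin{aligned} \mathrm{rank}\begin{pmatrix} \tilde{S}_a^{(k)} \\ S_a^{(k)} \big((A_0^{(k)})^{-1} - (A_l^{(k)})^{-1}\big) \end{pmatrix} &= \mathrm{rank}\begin{pmatrix} \tilde{S}_a^{(k)} \\ S_a^{(k)} (I_N + A_l^{(k)}) \end{pmatrix} \\ &= \mathrm{rank}\left( \begin{pmatrix} \tilde{S}_a^{(k)} \\ S_a^{(k)} (I_N + A_l^{(k)}) \end{pmatrix} (I_N - A_l^{(k)}) \right) \\ &= \mathrm{rank}\begin{pmatrix} \tilde{S}_a^{(k)} (I_N - A_l^{(k)}) \\ S_a^{(k)} (I_N + A_l^{(k)})(I_N - A_l^{(k)}) \end{pmatrix} \\ &= \mathrm{rank}\begin{pmatrix} S_a^{(k)} \\ \tilde{S}_a^{(k)} (I_N - A_l^{(k)}) \end{pmatrix} \\ &= \mathrm{rank}\begin{pmatrix} S_a^{(k)} \\ \tilde{S}_a^{(k)} (A_0^{(k)} - A_l^{(k)}) \end{pmatrix} \end{aligned} \tag{11}$$

where in the first and fourth identities we use Property 1-(i), i.e., $(A_l^{(k)})^2 = -I_N$ and then $(A_l^{(k)})^{-1} = -A_l^{(k)}$; in particular, 
$(I_N + A_l^{(k)})(I_N - A_l^{(k)}) = I_N - (A_l^{(k)})^2 = 2 I_N = -I_N$ over $F_3$. 

In addition, 

$$\mathrm{rank}\begin{pmatrix} \tilde{S}_a^{(k)} \\ S_a^{(k)} (A_0^{(k)})^{-1} \end{pmatrix} = \mathrm{rank}\begin{pmatrix} \tilde{S}_a^{(k)} \\ S_a^{(k)} \end{pmatrix} = \mathrm{rank}\begin{pmatrix} S_a^{(k)} \\ \tilde{S}_a^{(k)} A_0^{(k)} \end{pmatrix}.$$

Therefore, the result can be obtained from (2), (3), (4) and (5). ■ 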


Theorem 2. The disk I/O to optimally repair the first or second parity node of the (k + 2, k) Zigzag code is at least 
$kN + \frac{k-3}{2(k-1)} N$. 

Proof: Suppose that $S_a^{(k)}$ and $\tilde{S}_a^{(k)}$ are two repair matrices for the first parity node of the (k + 2, k) Zigzag code. According 
to the definition of repair disk I/O, we need to prove $kN_1 + N_2 \ge kN + \frac{k-3}{2(k-1)} N$, where $N_1$ and $N_2$ respectively denote the 
numbers of nonzero columns of the matrices $S_a^{(k)}$ and $\tilde{S}_a^{(k)}$. 

By (2) and (3), we have 

$$\mathrm{rank}\begin{pmatrix} S_a^{(k)} \\ \tilde{S}_a^{(k)} A_0^{(k)} \end{pmatrix} = \mathrm{rank}\begin{pmatrix} S_a^{(k)} \\ \tilde{S}_a^{(k)} \end{pmatrix} = N \tag{12}$$

and 

$$\mathrm{rank}\begin{pmatrix} S_a^{(k)} \\ \tilde{S}_a^{(k)} (A_0^{(k)} - A_l^{(k)}) \end{pmatrix} = \mathrm{rank}\begin{pmatrix} S_a^{(k)} \\ \tilde{S}_a^{(k)} (I_N - A_l^{(k)}) \end{pmatrix} = \frac{N}{2}, \quad 1 \le l < k. \tag{13}$$

For $0 \le i < N$, denote by $S_a^{(k)}[i]$ and $\tilde{S}_a^{(k)}[i]$ the column i of $S_a^{(k)}$ and $\tilde{S}_a^{(k)}$. Assume that columns $i_1, \ldots, i_{N-N_1}$ of 
$S_a^{(k)}$ are zero columns. Note that in (13), $\mathrm{rank}(S_a^{(k)}) = \mathrm{rank}(\tilde{S}_a^{(k)} (I_N - A_l^{(k)})) = N/2$. Then, we have that 
$(\tilde{S}_a^{(k)} (I_N - A_l^{(k)}))[i_s] = \tilde{S}_a^{(k)}[i_s] - (\tilde{S}_a^{(k)} A_l^{(k)})[i_s]$ is also a zero column, i.e., 

$$(\tilde{S}_a^{(k)} A_l^{(k)})[i_s] = \tilde{S}_a^{(k)}[i_s] \text{ for } 1 \le l < k \text{ and } 1 \le s \le N - N_1.$$

Further, it follows from Property 1-(ii) and (10) that only the $(i \oplus e_l)$th entry in $A_l^{(k)}[i]$ is ±1, which implies 
$(\tilde{S}_a^{(k)} A_l^{(k)})[i_s] = \pm \tilde{S}_a^{(k)}[i_s \oplus e_l]$. Thus, 

$$\tilde{S}_a^{(k)}[i_s \oplus e_l] = \pm \tilde{S}_a^{(k)}[i_s] \text{ for } 1 \le l < k \text{ and } 1 \le s \le N - N_1. \tag{14}$$

On the other hand, it is seen from (12) that all the columns $i_1, i_2, \ldots, i_{N-N_1}$ of $\tilde{S}_a^{(k)}$ are linearly independent, which 
indicates that 

$$\{i_u \oplus e_l : 1 \le l < k\} \cap \{i_v \oplus e_l : 1 \le l < k\} = \emptyset \text{ for } 1 \le u \ne v \le N - N_1. \tag{15}$$

Therefore, applying (14) and (15) to $\mathrm{rank}(\tilde{S}_a^{(k)}) = N/2$, we obtain $N/2 \le N - (k-1)(N - N_1)$, i.e., $N_1 \ge N - \frac{N}{2(k-1)}$. 
By means of (11), we can prove $N_2 \ge N - \frac{N}{2(k-1)}$ in the same fashion. Hence, 

$$kN_1 + N_2 \ge (k+1)\left(N - \frac{N}{2(k-1)}\right) = kN + N - \frac{N(k+1)}{2(k-1)} = kN + \frac{k-3}{2(k-1)} N.$$

That is, the assertion is valid for the first parity node. 

For the second parity node of the (k + 2, k) Zigzag code, assume that $S_b^{(k)} A_j^{(k)}$, $0 \le j < k$, and $\tilde{S}_b^{(k)}$ are the repair 
matrices. According to the definition, the repair disk I/O is the total number of nonzero columns of the matrices $S_b^{(k)} A_j^{(k)}$, 
$0 \le j < k$, and $\tilde{S}_b^{(k)}$, which is $kN_1 + N_2$ by Property 1-(ii), where $N_1$ and $N_2$ respectively denote the numbers of nonzero 
columns of the matrices $S_b^{(k)}$ and $\tilde{S}_b^{(k)}$. By Lemma 1, it is known that $\tilde{S}_b^{(k)}$ and $S_b^{(k)}$ are two repair matrices for the first 
parity node. Therefore, by the analysis for the first parity node we have $N_1 \ge N - \frac{N}{2(k-1)}$ and $N_2 \ge N - \frac{N}{2(k-1)}$, i.e., 
$kN_1 + N_2 \ge kN + \frac{k-3}{2(k-1)} N$. ■ 
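
For orientation, the sketch below tabulates, for a few values of k, the minimal repair bandwidth (k + 1)N/2, the lower bound of Theorem 2, and the disk I/O kN + N − k that the paper states for its repair strategy. The function name is an illustrative assumption; the three formulas are the ones quoted in the text.

```python
from fractions import Fraction


def parity_repair_io(k):
    """Return (N, minimal bandwidth, Theorem 2 lower bound, stated disk I/O)."""
    N = 2 ** (k - 1)
    minimal_bandwidth = Fraction((k + 1) * N, 2)
    lower_bound = k * N + Fraction((k - 3) * N, 2 * (k - 1))
    achieved = k * N + N - k      # disk I/O stated for the proposed strategy
    return N, minimal_bandwidth, lower_bound, achieved


for k in range(3, 8):
    print(k, *parity_repair_io(k))
```

For k = 3 the bound is 12 and the stated disk I/O is 13, so the gap is a single symbol; both quantities are dominated by the kN term as k grows.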


V. Repair matrices for the parity nodes of the Zigzag code 

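In this section, we give the repair matrices for the parity nodes of the (k + 2, k) Zigzag code and verify that they satisfy 
(2), (3), (4) and (5). 

Recursively define the $2^{k-2} \times 2^{k-1}$ matrices $E^{(k)}$ and $F^{(k)}$ over $F_3$ as 

$$E^{(k)} = \begin{pmatrix} E^{(k-1)} & 0 \\ 0 & F^{(k-1)} \end{pmatrix}, \quad F^{(k)} = \begin{pmatrix} F^{(k-1)} & 0 \\ 0 & E^{(k-1)} \end{pmatrix}, \quad k \ge 3 \tag{16}$$

where 

$$E^{(2)} = \begin{pmatrix} 0 & -1 \end{pmatrix}, \quad F^{(2)} = \begin{pmatrix} -1 & 0 \end{pmatrix}. \tag{17}$$

Next, recursively define the $2^{k-2} \times 2^{k-1}$ matrices $S_a^{(k)}$ and $\tilde{S}_a^{(k)}$ over $F_3$ as 

$$S_a^{(k)} = \begin{pmatrix} S_a^{(k-1)} & E^{(k-1)} \\ 0 & \tilde{S}_a^{(k-1)} \end{pmatrix}, \quad \tilde{S}_a^{(k)} = \begin{pmatrix} \tilde{S}_a^{(k-1)} & -F^{(k-1)} \\ 0 & S_a^{(k-1)} \end{pmatrix}, \quad k \ge 3 \tag{18}$$

where 

$$S_a^{(2)} = \begin{pmatrix} 0 & 1 \end{pmatrix}, \quad \tilde{S}_a^{(2)} = \begin{pmatrix} 1 & 1 \end{pmatrix}. \tag{19}$$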

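Proposition 1. For $k \ge 2$, $\mathrm{rank}\begin{pmatrix} S_a^{(k)} \\ \tilde{S}_a^{(k)} A_0^{(k)} \end{pmatrix} = N$. 

Proof: When k = 2, the statement is easily checked. For any given $k \ge 2$, suppose that the statement is true. According 
to the recursive definition in (18), we have 

$$\mathrm{rank}\begin{pmatrix} S_a^{(k+1)} \\ \tilde{S}_a^{(k+1)} A_0^{(k+1)} \end{pmatrix} = \mathrm{rank}\begin{pmatrix} S_a^{(k+1)} \\ \tilde{S}_a^{(k+1)} \end{pmatrix} = \mathrm{rank}\begin{pmatrix} S_a^{(k)} & E^{(k)} \\ 0 & \tilde{S}_a^{(k)} \\ \tilde{S}_a^{(k)} & -F^{(k)} \\ 0 & S_a^{(k)} \end{pmatrix} = \mathrm{rank}\begin{pmatrix} S_a^{(k)} & E^{(k)} \\ \tilde{S}_a^{(k)} & -F^{(k)} \\ 0 & S_a^{(k)} \\ 0 & \tilde{S}_a^{(k)} \end{pmatrix} = 2N$$

since $\begin{pmatrix} S_a^{(k)} \\ \tilde{S}_a^{(k)} \end{pmatrix}$ is an N × N matrix of full rank. 

Thus, the proof is finished by the above induction. ■ 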


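Proposition 2. For $k \ge 2$, $\mathrm{rank}\begin{pmatrix} S_a^{(k)} \\ \tilde{S}_a^{(k)} (A_0^{(k)} - A_1^{(k)}) \end{pmatrix} = N/2$. 

Proof: When k = 2, the statement is easily checked. When k > 2, by the recursive definitions in (8) and (18), we have 

$$\tilde{S}_a^{(k)} (A_0^{(k)} - A_1^{(k)}) = \tilde{S}_a^{(k)} (I_N - A_1^{(k)}) = \begin{pmatrix} \tilde{S}_a^{(k-1)} & -F^{(k-1)} \\ 0 & S_a^{(k-1)} \end{pmatrix} \begin{pmatrix} I_{N/2} & I_{N/2} \\ -I_{N/2} & I_{N/2} \end{pmatrix} = \begin{pmatrix} \tilde{S}_a^{(k-1)} + F^{(k-1)} & \tilde{S}_a^{(k-1)} - F^{(k-1)} \\ -S_a^{(k-1)} & S_a^{(k-1)} \end{pmatrix}.$$

Therefore, 

$$\begin{aligned} \mathrm{rank}\begin{pmatrix} S_a^{(k)} \\ \tilde{S}_a^{(k)} (A_0^{(k)} - A_1^{(k)}) \end{pmatrix} &= \mathrm{rank}\begin{pmatrix} S_a^{(k-1)} & E^{(k-1)} \\ 0 & \tilde{S}_a^{(k-1)} \\ \tilde{S}_a^{(k-1)} + F^{(k-1)} & \tilde{S}_a^{(k-1)} - F^{(k-1)} \\ -S_a^{(k-1)} & S_a^{(k-1)} \end{pmatrix} \\ &= \mathrm{rank}\left( P \begin{pmatrix} 0 & S_a^{(k-1)} + E^{(k-1)} \\ 0 & \tilde{S}_a^{(k-1)} \\ \tilde{S}_a^{(k-1)} + F^{(k-1)} & 0 \\ -S_a^{(k-1)} & 0 \end{pmatrix} Q \right) \\ &= \mathrm{rank}\begin{pmatrix} \tilde{S}_a^{(k-1)} + F^{(k-1)} \\ S_a^{(k-1)} \end{pmatrix} + \mathrm{rank}\begin{pmatrix} S_a^{(k-1)} + E^{(k-1)} \\ \tilde{S}_a^{(k-1)} \end{pmatrix} \end{aligned}$$

where the two nonsingular matrices P, Q are respectively defined by 

$$P = \begin{pmatrix} I_{N/4} & 0 & 0 & -I_{N/4} \\ 0 & I_{N/4} & 0 & 0 \\ 0 & -I_{N/4} & I_{N/4} & 0 \\ 0 & 0 & 0 & I_{N/4} \end{pmatrix}, \quad Q = \begin{pmatrix} I_{N/2} & -I_{N/2} \\ 0 & I_{N/2} \end{pmatrix}.$$

Next, we prove 

$$\mathrm{rank}\begin{pmatrix} S_a^{(k)} + E^{(k)} \\ \tilde{S}_a^{(k)} \end{pmatrix} = \mathrm{rank}\begin{pmatrix} \tilde{S}_a^{(k)} + F^{(k)} \\ S_a^{(k)} \end{pmatrix} = N/2 \tag{20}$$

for any $k \ge 2$ by induction. 

When k = 2, the statement is easily verified. For any $k \ge 2$, suppose that it is true. By the definitions of $E^{(k+1)}$, $S_a^{(k+1)}$ and $\tilde{S}_a^{(k+1)}$ 
in (16) and (18), we then have 

$$\begin{aligned} \mathrm{rank}\begin{pmatrix} S_a^{(k+1)} + E^{(k+1)} \\ \tilde{S}_a^{(k+1)} \end{pmatrix} &= \mathrm{rank}\begin{pmatrix} S_a^{(k)} + E^{(k)} & E^{(k)} \\ 0 & \tilde{S}_a^{(k)} + F^{(k)} \\ \tilde{S}_a^{(k)} & -F^{(k)} \\ 0 & S_a^{(k)} \end{pmatrix} \\ &= \mathrm{rank}\left( \begin{pmatrix} I_{N/2} & 0 & 0 & I_{N/2} \\ 0 & I_{N/2} & 0 & 0 \\ 0 & I_{N/2} & I_{N/2} & 0 \\ 0 & 0 & 0 & I_{N/2} \end{pmatrix} \begin{pmatrix} S_a^{(k)} + E^{(k)} & E^{(k)} \\ 0 & \tilde{S}_a^{(k)} + F^{(k)} \\ \tilde{S}_a^{(k)} & -F^{(k)} \\ 0 & S_a^{(k)} \end{pmatrix} \begin{pmatrix} I_N & -I_N \\ 0 & I_N \end{pmatrix} \right) \\ &= \mathrm{rank}\begin{pmatrix} S_a^{(k)} + E^{(k)} & 0 \\ 0 & \tilde{S}_a^{(k)} + F^{(k)} \\ \tilde{S}_a^{(k)} & 0 \\ 0 & S_a^{(k)} \end{pmatrix} \\ &= \mathrm{rank}\begin{pmatrix} S_a^{(k)} + E^{(k)} \\ \tilde{S}_a^{(k)} \end{pmatrix} + \mathrm{rank}\begin{pmatrix} \tilde{S}_a^{(k)} + F^{(k)} \\ S_a^{(k)} \end{pmatrix} \\ &= N \end{aligned}$$

where the last identity comes from the assumption. Similarly, we can get $\mathrm{rank}\begin{pmatrix} \tilde{S}_a^{(k+1)} + F^{(k+1)} \\ S_a^{(k+1)} \end{pmatrix} = N$. This completes 
the proof after substituting (20), applied to k − 1, into the rank identity above. ■ 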


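Proposition 3. Given $k \ge 3$, $\mathrm{rank}\begin{pmatrix} S_a^{(k)} \\ \tilde{S}_a^{(k)} (A_0^{(k)} - A_j^{(k)}) \end{pmatrix} = N/2$ for all $2 \le j < k$. 

Proof: If k = 3, the statement is obvious. For any $k \ge 3$, assume that it is true for all $2 \le j < k$. When $j \ge 2$, according 
to the definitions of $A_j^{(k+1)}$ in (8) and $S_a^{(k+1)}$, $\tilde{S}_a^{(k+1)}$ in (18), 

$$\mathrm{rank}\begin{pmatrix} S_a^{(k+1)} \\ \tilde{S}_a^{(k+1)} (A_0^{(k+1)} - A_j^{(k+1)}) \end{pmatrix} = \mathrm{rank}\begin{pmatrix} S_a^{(k+1)} \\ \tilde{S}_a^{(k+1)} (I_{2N} - A_j^{(k+1)}) \end{pmatrix} = \mathrm{rank}\begin{pmatrix} U_j^{(k)} & W_j^{(k)} \\ 0 & V_j^{(k)} \end{pmatrix}$$

for three N × N matrices 

$$U_j^{(k)} = \begin{pmatrix} S_a^{(k)} \\ \tilde{S}_a^{(k)} (I_N - A_{j-1}^{(k)}) \end{pmatrix}, \quad V_j^{(k)} = \begin{pmatrix} \tilde{S}_a^{(k)} \\ S_a^{(k)} (I_N + A_{j-1}^{(k)}) \end{pmatrix}, \quad W_j^{(k)} = \begin{pmatrix} E^{(k)} \\ -F^{(k)} (I_N + A_{j-1}^{(k)}) \end{pmatrix}$$

which, by the recursive definitions, satisfy 

$$W_j^{(k)} = \begin{cases} -U_j^{(k)} + R^{(k)} V_j^{(k)}, & \text{if } j = 2 \\ U_j^{(k)} Q^{(k)} - P^{(k)} V_j^{(k)}, & \text{if } j > 2 \end{cases}$$

where 

$$R^{(k)} = \begin{pmatrix} 0_{N/4} & I_{N/4} & I_{N/4} & 0_{N/4} \\ -I_{N/4} & 0_{N/4} & 0_{N/4} & I_{N/4} \\ 0_{N/4} & 0_{N/4} & 0_{N/4} & I_{N/4} \\ 0_{N/4} & 0_{N/4} & -I_{N/4} & 0_{N/4} \end{pmatrix}, \quad P^{(k)} = \begin{pmatrix} 0_{N/4} & 0_{N/4} & 0_{N/4} & 0_{N/4} \\ I_{N/4} & 0_{N/4} & 0_{N/4} & 0_{N/4} \\ 0_{N/4} & 0_{N/4} & 0_{N/4} & 0_{N/4} \\ 0_{N/4} & 0_{N/4} & I_{N/4} & 0_{N/4} \end{pmatrix}, \quad Q^{(k)} = \begin{pmatrix} 0_{N/2} & 0_{N/2} \\ I_{N/2} & 0_{N/2} \end{pmatrix}$$

and $0_N$ denotes the zero matrix of order N. 

Hence, 

$$\mathrm{rank}\begin{pmatrix} S_a^{(k+1)} \\ \tilde{S}_a^{(k+1)} (A_0^{(k+1)} - A_j^{(k+1)}) \end{pmatrix} = \mathrm{rank}\begin{pmatrix} S_a^{(k)} \\ \tilde{S}_a^{(k)} (I_N - A_{j-1}^{(k)}) \end{pmatrix} + \mathrm{rank}\begin{pmatrix} \tilde{S}_a^{(k)} \\ S_a^{(k)} (I_N + A_{j-1}^{(k)}) \end{pmatrix} \tag{21}$$

for $j \ge 2$. 

Further, note from (8) that $A_0^{(k)} - A_{j-1}^{(k)} = I_N - A_{j-1}^{(k)}$ is nonsingular if $j \ge 2$. Then, 

$$\mathrm{rank}\begin{pmatrix} \tilde{S}_a^{(k)} \\ S_a^{(k)} (I_N + A_{j-1}^{(k)}) \end{pmatrix} = \mathrm{rank}\begin{pmatrix} S_a^{(k)} \\ \tilde{S}_a^{(k)} (I_N - A_{j-1}^{(k)}) \end{pmatrix} = N/2$$

where in the first identity we use (11) and in the last identity we use the assumption if $j \ge 3$ and Proposition 2 if j = 2. This 
completes the proof after substituting into (21). ■ 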


The following main result is immediate. 

Theorem 3. The matrices $S_a^{(k)}$ and $\tilde{S}_a^{(k)}$ defined by (16), (17), (18) and (19) are the repair matrices for the first parity node of the 
(k + 2, k) Zigzag code, whose repair disk I/O is kN + N − k. 

Proof: The optimal repair property of the repair matrices $S_a^{(k)}$ and $\tilde{S}_a^{(k)}$ is obvious from Propositions 1, 2 and 3. 
Note that there is only one zero column in $S_a^{(k)}$ and no zero columns in $\tilde{S}_a^{(k)}$, which means N − 1 elements should be read 
in each of the systematic nodes and all the N elements should be read in the second parity node to repair the first parity node. 
Thus the disk I/O to repair the first parity node is k(N − 1) + N = kN + N − k. ■ 
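
The conditions (2) and (3) and the disk I/O count of Theorem 3 can be checked numerically for small k. The sketch below is an illustration under stated assumptions: it uses the recursion $A_1^{(k)} = [[0, -I], [I, 0]]$, $A_j^{(k)} = \mathrm{diag}(A_{j-1}^{(k-1)}, -A_{j-1}^{(k-1)})$ for the coding matrices, and the block recursions $E^{(k)} = \mathrm{diag}(E, F)$, $F^{(k)} = \mathrm{diag}(F, E)$, $S_a^{(k)} = [[S_a, E], [0, \tilde{S}_a]]$, $\tilde{S}_a^{(k)} = [[\tilde{S}_a, -F], [0, S_a]]$ with initial values $E^{(2)} = (0\ {-1})$, $F^{(2)} = ({-1}\ 0)$, $S_a^{(2)} = (0\ 1)$, $\tilde{S}_a^{(2)} = (1\ 1)$. All function names are hypothetical.

```python
import numpy as np

P = 3  # the Zigzag code considered here is defined over F_3


def rank_mod_p(M, p=P):
    """Rank of an integer matrix over F_p (p prime) by Gaussian elimination."""
    M = np.array(M, dtype=np.int64) % p
    rank = 0
    for c in range(M.shape[1]):
        pivot = next((r for r in range(rank, M.shape[0]) if M[r, c]), None)
        if pivot is None:
            continue
        M[[rank, pivot]] = M[[pivot, rank]]                      # move pivot row up
        M[rank] = M[rank] * pow(int(M[rank, c]), p - 2, p) % p   # scale pivot to 1
        for r in range(M.shape[0]):
            if r != rank and M[r, c]:
                M[r] = (M[r] - M[r, c] * M[rank]) % p
        rank += 1
    return rank


def coding_matrices(k):
    """Coding matrices A_0^(k), ..., A_{k-1}^(k) of the Zigzag code."""
    n = 2 ** (k - 1)
    I = np.eye(n // 2, dtype=int)
    Z = np.zeros((n // 2, n // 2), dtype=int)
    A = [np.eye(n, dtype=int), np.block([[Z, -I], [I, Z]])]
    if k > 2:
        prev = coding_matrices(k - 1)
        A += [np.block([[prev[j - 1], Z], [Z, -prev[j - 1]]]) for j in range(2, k)]
    return A


def first_parity_repair_matrices(k):
    """S_a^(k) and S~_a^(k) built from the initial values for k = 2."""
    E, F = np.array([[0, -1]]), np.array([[-1, 0]])
    S, T = np.array([[0, 1]]), np.array([[1, 1]])    # T stands for S~_a
    for _ in range(k - 2):
        Z = np.zeros_like(S)
        S, T, E, F = (np.block([[S, E], [Z, T]]), np.block([[T, -F], [Z, S]]),
                      np.block([[E, Z], [Z, F]]), np.block([[F, Z], [Z, E]]))
    return S, T


def check_first_parity(k):
    """Return (conditions (2) and (3) hold, disk I/O = k*N_1 + N_2)."""
    N = 2 ** (k - 1)
    A = coding_matrices(k)
    S, T = first_parity_repair_matrices(k)
    cond2 = rank_mod_p(np.vstack([S, T @ A[0]])) == N
    cond3 = all(rank_mod_p(np.vstack([S, T @ (A[0] - A[i])])) == N // 2
                for i in range(1, k))
    disk_io = k * int(S.any(axis=0).sum()) + int(T.any(axis=0).sum())
    return cond2 and cond3, disk_io


for k in range(2, 6):
    print(k, check_first_parity(k), k * 2 ** (k - 1) + 2 ** (k - 1) - k)
```

The last printed number in each row is kN + N − k, so it can be compared directly with the counted disk I/O.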

By Lemma 1, the second parity node of the (k + 2, k) Zigzag code can also be optimally repaired. However, if we use 
$S_b^{(k)} A_i^{(k)}$, $0 \le i < k$, and $\tilde{S}_b^{(k)}$ as the repair matrices, where $S_b^{(k)} = \tilde{S}_a^{(k)}$ and $\tilde{S}_b^{(k)} = S_a^{(k)}$ are defined by (16), (17), (18) and 
(19), then its repair disk I/O will be kN + N − 1 since $\tilde{S}_b^{(k)}$ has only one zero column and $S_b^{(k)} A_i^{(k)}$ has no zero columns 
for $0 \le i < k$. In the following, by choosing other initial values of $E^{(2)}$, $F^{(2)}$, $S_a^{(2)}$, $\tilde{S}_a^{(2)}$ in (17) and (19), the disk I/O to 
optimally repair the second parity node can also be reduced to kN + N − k. 

Reset 

$$E^{(2)} = \begin{pmatrix} -1 & 0 \end{pmatrix}, \quad F^{(2)} = \begin{pmatrix} 0 & -1 \end{pmatrix}, \quad S_a^{(2)} = \begin{pmatrix} 1 & -1 \end{pmatrix}, \quad \tilde{S}_a^{(2)} = \begin{pmatrix} 0 & 1 \end{pmatrix}, \tag{22}$$

then we have the following result. 

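Theorem 4. Let $S_a^{(k)}$ and $\tilde{S}_a^{(k)}$ be defined by (22), (16) and (18). Then $S_b^{(k)} A_i^{(k)}$, $0 \le i < k$, and $\tilde{S}_b^{(k)}$ are the repair matrices 
for the second parity node of the (k + 2, k) Zigzag code, where $S_b^{(k)} = \tilde{S}_a^{(k)}$ and $\tilde{S}_b^{(k)} = S_a^{(k)}$. Moreover, the disk I/O to 
optimally repair the second parity node is kN + N − k. 

Proof: Firstly, it can be easily verified that the results in Propositions 1, 2 and 3 also hold for $S_a^{(k)}$ and $\tilde{S}_a^{(k)}$ defined 
from the initial values $E^{(2)}$, $F^{(2)}$, $S_a^{(2)}$ and $\tilde{S}_a^{(2)}$ in (22). Secondly, it follows from Lemma 1 that $S_b^{(k)} A_i^{(k)}$, $0 \le i < k$, and 
$\tilde{S}_b^{(k)}$ are the repair matrices for the second parity node of the (k + 2, k) Zigzag code. Finally, $S_b^{(k)} = \tilde{S}_a^{(k)}$ has exactly one zero 
column and $\tilde{S}_b^{(k)} = S_a^{(k)}$ has none, so by Property 1-(ii) the disk I/O is k(N − 1) + N = kN + N − k. ■ 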
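
The same kind of numerical check applies to the second parity node. The sketch below is again an illustration under stated assumptions: it restarts the block recursions from the alternative initial values $E^{(2)} = ({-1}\ 0)$, $F^{(2)} = (0\ {-1})$, $S_a^{(2)} = (1\ {-1})$, $\tilde{S}_a^{(2)} = (0\ 1)$, sets $S_b = \tilde{S}_a$ and $\tilde{S}_b = S_a$, and uses $(A_i^{(k)})^{-1} = -A_i^{(k)}$ for $i \ge 1$ (Property 1-(i)). Function names are hypothetical.

```python
import numpy as np


def rank_mod_p(M, p=3):
    """Rank of an integer matrix over F_p (p prime) by Gaussian elimination."""
    M = np.array(M, dtype=np.int64) % p
    rank = 0
    for c in range(M.shape[1]):
        pivot = next((r for r in range(rank, M.shape[0]) if M[r, c]), None)
        if pivot is None:
            continue
        M[[rank, pivot]] = M[[pivot, rank]]
        M[rank] = M[rank] * pow(int(M[rank, c]), p - 2, p) % p
        for r in range(M.shape[0]):
            if r != rank and M[r, c]:
                M[r] = (M[r] - M[r, c] * M[rank]) % p
        rank += 1
    return rank


def coding_matrices(k):
    """Coding matrices A_0^(k), ..., A_{k-1}^(k) of the Zigzag code."""
    n = 2 ** (k - 1)
    I = np.eye(n // 2, dtype=int)
    Z = np.zeros((n // 2, n // 2), dtype=int)
    A = [np.eye(n, dtype=int), np.block([[Z, -I], [I, Z]])]
    if k > 2:
        prev = coding_matrices(k - 1)
        A += [np.block([[prev[j - 1], Z], [Z, -prev[j - 1]]]) for j in range(2, k)]
    return A


def second_parity_repair_matrices(k):
    """Return (S_b, S~_b) = (S~_a, S_a) built from the alternative initial values."""
    E, F = np.array([[-1, 0]]), np.array([[0, -1]])
    S, T = np.array([[1, -1]]), np.array([[0, 1]])   # S = S_a, T = S~_a
    for _ in range(k - 2):
        Z = np.zeros_like(S)
        S, T, E, F = (np.block([[S, E], [Z, T]]), np.block([[T, -F], [Z, S]]),
                      np.block([[E, Z], [Z, F]]), np.block([[F, Z], [Z, E]]))
    return T, S


def check_second_parity(k):
    """Return (conditions (4) and (5) hold, disk I/O)."""
    N = 2 ** (k - 1)
    A = coding_matrices(k)
    A_inv = [A[0]] + [-Ai for Ai in A[1:]]           # A_i^{-1} = -A_i for i >= 1
    Sb, Sb_tilde = second_parity_repair_matrices(k)
    cond4 = rank_mod_p(np.vstack([Sb, Sb_tilde @ A_inv[0]])) == N
    cond5 = all(rank_mod_p(np.vstack([Sb, Sb_tilde @ (A_inv[0] - A_inv[i])])) == N // 2
                for i in range(1, k))
    # Symbols read: nonzero columns of S_b A_i at each systematic node i,
    # plus nonzero columns of S~_b at the first parity node.
    disk_io = sum(int((Sb @ Ai).any(axis=0).sum()) for Ai in A)
    disk_io += int(Sb_tilde.any(axis=0).sum())
    return cond4 and cond5, disk_io


for k in range(2, 6):
    print(k, check_second_parity(k), k * 2 ** (k - 1) + 2 ** (k - 1) - k)
```

Because each $A_i^{(k)}$ is a signed permutation matrix, $S_b A_i$ has as many nonzero columns as $S_b$, which is why the single zero column of $S_b$ saves one read at every systematic node.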

From Theorems 3 and 4, it is seen that the disk I/O to optimally repair the parity nodes of the Zigzag code is very close 
to the lower bound given in Theorem 2. 

Finally, we give some examples of the repair matrices for the parity nodes of the (k + 2, k) Zigzag code. 


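Example 1. The first parity node of the (5,3) Zigzag code, (6,4) Zigzag code, and (7,5) Zigzag code can be respectively 
optimally repaired by the following matrices 

$$S_a^{(3)} = \begin{pmatrix} 0 & 1 & 0 & -1 \\ 0 & 0 & 1 & 1 \end{pmatrix}, \quad \tilde{S}_a^{(3)} = \begin{pmatrix} 1 & 1 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix},$$

$$S_a^{(4)} = \begin{pmatrix} 0 & 1 & 0 & -1 & 0 & -1 & 0 & 0 \\ 0 & 0 & 1 & 1 & 0 & 0 & -1 & 0 \\ 0 & 0 & 0 & 0 & 1 & 1 & 1 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 \end{pmatrix}, \quad \tilde{S}_a^{(4)} = \begin{pmatrix} 1 & 1 & 1 & 0 & 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 & 0 & 0 & 0 & 1 \\ 0 & 0 & 0 & 0 & 0 & 1 & 0 & -1 \\ 0 & 0 & 0 & 0 & 0 & 0 & 1 & 1 \end{pmatrix},$$

$$S_a^{(5)} = \left(\begin{array}{rrrrrrrrrrrrrrrr} 0 & 1 & 0 & -1 & 0 & -1 & 0 & 0 & 0 & -1 & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 1 & 1 & 0 & 0 & -1 & 0 & 0 & 0 & -1 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 1 & 1 & 1 & 0 & 0 & 0 & 0 & 0 & -1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & -1 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & 1 & 1 & 0 & 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & 0 & 0 & 0 & 1 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & 0 & -1 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & 1 \end{array}\right),$$

$$\tilde{S}_a^{(5)} = \left(\begin{array}{rrrrrrrrrrrrrrrr} 1 & 1 & 1 & 0 & 1 & 0 & 0 & 0 & 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 & 0 & 0 & 0 & 1 & 0 & 0 & 0 & 1 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 1 & 0 & -1 & 0 & 0 & 0 & 0 & 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & 1 & 1 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & 0 & -1 & 0 & -1 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & 1 & 0 & 0 & -1 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & 1 & 1 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 \end{array}\right).$$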

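The second parity node of the (5,3) Zigzag code, (6,4) Zigzag code, and (7,5) Zigzag code can be respectively optimally 
repaired by the following matrices 

$$S_b^{(3)} = \begin{pmatrix} 0 & 1 & 0 & 1 \\ 0 & 0 & 1 & -1 \end{pmatrix}, \quad \tilde{S}_b^{(3)} = \begin{pmatrix} 1 & -1 & -1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix},$$

$$S_b^{(4)} = \begin{pmatrix} 0 & 1 & 0 & 1 & 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & -1 & 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 & 1 & -1 & -1 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 \end{pmatrix}, \quad \tilde{S}_b^{(4)} = \begin{pmatrix} 1 & -1 & -1 & 0 & -1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 & 0 & 0 & 0 & -1 \\ 0 & 0 & 0 & 0 & 0 & 1 & 0 & 1 \\ 0 & 0 & 0 & 0 & 0 & 0 & 1 & -1 \end{pmatrix},$$

$$S_b^{(5)} = \left(\begin{array}{rrrrrrrrrrrrrrrr} 0 & 1 & 0 & 1 & 0 & 1 & 0 & 0 & 0 & 1 & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 1 & -1 & 0 & 0 & 1 & 0 & 0 & 0 & 1 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 1 & -1 & -1 & 0 & 0 & 0 & 0 & 0 & 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & -1 & -1 & 0 & -1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & 0 & 0 & 0 & -1 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & 0 & 1 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & -1 \end{array}\right),$$

$$\tilde{S}_b^{(5)} = \left(\begin{array}{rrrrrrrrrrrrrrrr} 1 & -1 & -1 & 0 & -1 & 0 & 0 & 0 & -1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 & 0 & 0 & 0 & -1 & 0 & 0 & 0 & -1 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 1 & 0 & 1 & 0 & 0 & 0 & 0 & 0 & -1 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & 1 & -1 & 0 & 0 & 0 & 0 & 0 & 0 & -1 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & 0 & 1 & 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & -1 & 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & -1 & -1 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 \end{array}\right).$$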